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Abstract: Insert ion/delet ion (indel) mutations, which are represented by gaps in multiple 
sequence alignments, have been used to examine phylogenetic hypotheses for some time. 
However, most analyses combine gap data with the nucleotide sequences in which they are 
embedded, probably because most phylogenetic datasets include few gap characters. Here, 
we report analyses of 12,030 gap characters from an alignment of avian nuclear genes 
using maximum parsimony (MP) and a simple maximum likelihood (ML) framework. 
Both trees were similar, and they exhibited almost all of the strongly supported relationships 
in the nucleotide tree, although neither gap tree supported many relationships that have 
proven difficult to recover in previous studies. Moreover, independent lines of evidence 
typically corroborated the nucleotide topology instead of the gap topology when they 
disagreed, although the number of conflicting nodes with high bootstrap support was hmited. 
Filtering to remove short indels did not substantially reduce homoplasy or reduce conflict. 
Combined analyses of nucleotides and gaps resulted in the nucleotide topology, but with 
increased support, suggesting that gap data may prove most useftil when analyzed in 
combination with nucleotide substitutions. 

Keywords: bird classification; avian phylogeny; nucleotide sequence alignment; 
total evidence; Columbiformes; Coraciiformes; Galliformes 



1. Introduction 

In DNA and protein sequence alignments, gaps are used to represent positions where insertion/deletion 
(indel) events have occurred, reflecting the absence of nucleotides or amino acids in specific sequences. 
Although indels accumulate in most genomic regions, they are more common in non-coding regions 
(e.g., introns) than in protein coding regions. Intron sequences have typically been used to examine 
relatively recent divergences (e.g., [1-5]), but there has been a growing appreciation that non-coding 
sequences also represent a rich source of phylogenetic information at deeper levels in vertebrate 
phylogeny. Indeed, non-coding data have been used to estimate phylogeny for a number of vertebrate 
orders (e.g., [6-8]) and classes (e.g., [9-13]). 

The process of multiple sequence alignment results in the concurrent inference of gaps that reflect 
the position of indels [14]. Inferred gap positions are often coded as binary characters that reflect the 
hypothetical positions where insertions or deletions have occurred (hereafter, called "gap characters," 
also see [9,15-18]), although more complex coding schemes are possible [19]. Regardless of the 
specific gap-coding scheme, including information about indels in phylogenetic analyses can 
increase the information available in multiple sequence alignments without requiring additional data 
collection [19,20]. In spite of this, few phylogenetic studies incorporate this information, usually 
treating gaps as missing data [21,22]. However, phylogenetic analyses that treat gaps as missing data 
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can be statistically inconsistent, even when the model of sequence evolution is simple and the true 
alignment is available [22]. Moreover, the historical information available from gap characters may be 
especially valuable, since they appear to exhibit less homoplasy than nucleotide substitutions 
(e.g., [4,12,20]). Thus, identifying the best methods for coding and analyzing gap characters (or finding 
other approaches to incorporate indels into phylogenetic analyses) represents an important challenge. 

Despite their potential value for phylogenetic analyses, gap characters also have the potential to be 
sources of error, just like other types of data. First, the multiple sequence alignment used to score the 
gap characters may be inaccurate. Alignment has a major impact upon phylogenetic estimation 
(e.g., [23-27]), even when gap characters are not analyzed. In fact, alignment error has been suggested 
to represent a ftindamental problem for the use of non-coding regions to address deep divergences 
(e.g., [28,29]), although when examined careftiUy it is clear that phylogenetic analyses of some 
non-coding data matrices are relatively insensitive to the details of alignment (e.g., [30,31]). Finally, 
the indels that underlie gap characters may exhibit homoplasy. Some analyses of gap characters have 
reported misleading signal associated with gaps (e.g., [32,33]), including evidence for long-branch 
attraction [34]. These issues are expected to introduce error into analyses of gap character matrices, 
suggesting that empirical studies that establish the relative amounts of historical signal and noise 
associated with gaps scored for alignments of different types of sequence data. 

The congruence of trees based upon gap characters and nucleotide substitutions for the same 
sequences can be used to assess performance of phylogenetic analyses of gap characters. Because gap 
characters typically exhibit less homoplasy than nucleotide substitutions (e.g., [20]), it is reasonable to 
hypothesize that gaps will have stronger phylogenetic signal than nucleotides. However, like other 
types of low homoplasy characters (e.g., [35]), changes in gap characters accumulate slowly, and this 
may limit their power to resolve difficult phylogenetic problems [36,37]. Most gap character matrices 
used in phylogenetic studies have been relatively small and, thus, have been unable to resolve 
phylogenetic relationships independently of nucleotide data. In fact, a recent study focused on avian 
phylogeny [38] included only 287 characters; analyses of those gaps alone were unable to resolve the 
avian tree. A few studies have used large numbers of gap characters [34,39], but those studies analyzed 
gaps in protein sequence alignments. Similar tests of the utility of gap characters from nucleotide 
sequence alignments of non-coding regions are desirable. 

A rigorous test of the hypothesis that the phylogenetic signal in gap characters is stronger than that 
in nucleotides also requires a phylogenetic problem that includes at least some difficult to resolve 
nodes. The Hackett et al. [13] data matrix (hereafter, called the "Early Bird" data matrix) included 
nearly 4 million base pairs (bp) of avian sequence data, most of which were non-coding (74% intron 
and 3% UTRs). The number of gaps (12,030 characters) in this data matrix exceeds that in previous 
studies of non-coding regions by at least an order of magnitude, though Hackett et al. [13] did not 
consider gaps in their analyses. As avian phylogeny has been a difficult problem to resolve, analyses of 
a large-scale matrix of gap characters based on the Early Bird [13] data should provide an excellent 
test of the utility of gaps for phylogenetic analyses. 

Here, we address five major questions about the utility of gap characters for phylogenetic analyses 
in avian non-coding regions. First, is the historical signal in the gap characters from Early Bird [13] 
stronger than, similar to or weaker than the signal in the nucleotide sequences? Second, do gap 
characters exhibit more or less homoplasy than nucleotides, and moreover, do gap characters based on 
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the insertion or deletion of a single nucleotide exhibit more homoplasy than those based upon longer 
indels? Third, are the trees supported by gap and nucleotide characters congruent, and if not, which of 
the two trees is better corroborated by other lines of evidence? Fourth, does maximum parsimony (MP) 
or maximum likelihood (ML) represent a better method for analyses of gap characters, or do both 
methods perform similarly? Finally, are total evidence analyses that combine gap and nucleotide 
data superior to individual analyses of either data type? We expect the answers to these questions to 
provide insight into the phylogenetic utility of gap characters that are largely based upon indels in 
non-coding regions. 

2. Methods 

2.1. DNA Sequence Data, Alignment and Gap Coding 

The Early Bird [13] data matrix comprises ~25 kilobases (kb) of sequence data per species (before 
alignment) from 19 nuclear loci obtained from 169 bird species (supporting information, file 1). The 
19 loci are located on 15 different chromosomes in the chicken genome [40], and they are likely to be 
unlinked in most or all avian lineages given the general conservation of avian karyotypes [41]. There 
was clear evidence that one locus (GHl) underwent a gene duphcation within birds [42]; a single GHl 
paralog was included for the taxa (Passeriformes) with two copies. Other details of the data matrix and 
alignment methods are provided in Hackett et al. [13] and Braun et al. [35]. 

The gap character matrix was generated using SeqState [43], which implements the simple indel 
coding method of Simmons and Ochoterena [19]. This method codes gaps as binary characters with 
"1" corresponding to presence of a gap (the absence of nucleotides) and "0" corresponding to absence 
of a gap (the presence of nucleotides). Gaps with different start and/or end positions are coded 
separately, and any gap that is enclosed within a longer gap is coded as missing ("?") for taxa with the 
longer gap. Three gap matrices were generated, one based upon all indels, a second with gap characters 
based on indels longer than 1 bp and a third with gap characters based on indels longer than 2 bp. All 
data matrices are available from the Early Bird web site [44]. 

2.2. Phylogenetic Analyses 

2.2.1. Parsimony Analyses 

We identified MP trees in PAUP* 4.0bl0 [45] using the parsimony ratchet [46]. Ratchet searches 
reweight a random subset of characters and conduct searches using those perturbed matrices, 
permitting a more thorough exploration of treespace (for a detailed explanation see Nixon [46]). For 
this study, the ratchet analyses used 100 iterations with 20% of informative characters perturbed and 
one tree held per iteration. To conduct the ratchet analyses, we used a C++ program (written by 
E.L.B.) that generates an appropriate PAUP* block. After conducting 100 ratchet iterations, the 
optimal trees were retained and tree bisection, and reconnection (TBR) branch swapping was 
conducted to identify the fiill set of MP trees. When we compared this strategy to a more typical tree 
search (random additions of taxa followed by TBR branch swapping), we found that the ratchet took a 
shorter amount of time and identified shorter trees. Ratchet bootstrap analysis used 500 replicates, each 
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of which used 100 ratchet iterations, as described above, with the final swapping limited instead to 
1,000 trees per bootstrap replicate. 

2.2.2. Likelihood Analyses 

Gap characters are binary, so a two-state Markov model (the Cavender-Farris-Neyman [CFN] 
model [47-49]) is appropriate for their analyses, at least in principle. However, all observed gap 
characters are by definition variable — ^their occurrence differs among taxa, otherwise they would not 
be discernible. Thus, gap characters exhibit an "acquisition bias" similar to that found in typical 
discrete morphological character matrices [50]. The acquisition bias for morphological data reflects the 
fact that most researchers only score parsimony informative characters; the failure to score uninformative 
characters is analogous to the inability to recover invariant gap characters (Felsenstein [51] referred to 
a similar phenomenon for restriction site data as "ascertainment bias"). Because of this issue, we 
employed a corrected CFN model that accommodates acquisition bias (we call this the CFNv model). 
The CFNv model is a special case of the more general Mkv model proposed by Lewis [50]; readers are 
referred to that publication for details. ML analyses using the CFNv model were conducted in PAUP* 
and GARLI v0.951 [52] after we converted the binary (01) gap characters to RY codes (O^R, l^Y). 

To correct acquisition bias in PAUP* and GARLI, we assumed that the observed variable 
characters (the gap matrix characters) were drawn fi*om a larger, hypothetical data matrix with an 
unknown number of invariant characters. Then we approximated this hypothetical matrix by appending 
invariant characters {i.e., columns that contain only "R" or "Y") to the observed gap matrix. Then, the 
number of invariant characters necessary to maximize the conditional likelihood [50] of the resulting 
gap data was estimated by systematically adding invariant characters and calculating the likelihood in 
PAUP* using the CFN model with F-distributed rates (called the CF+F model in that program). A Java 
program written by T.Y. was used to automate the addition of equal numbers of all R and all Y 
columns. The impact of correcting for acquisition bias was evaluated by analyzing the data without the 
added sites, but most analyses were conducted using the optimal number of added invariant characters. 

GARLI was used to search for the ML tree and to conduct likelihood bootstrap analyses. All 
analyses of gap data assumed equal state frequencies and a four-category discrete approximation to the 
F distribution (with the shape parameter estimated from the data). This corresponds to the CFNv+F 
model (for analyses with added invariant characters) or the CFN+F model (for analyses without added 
invariant characters). Up to 200 searches were conducted in GARLI to evaluate the ability of that 
program to identify the ML tree. 

2.2.3. Combined Analyses of Nucleotides and Gaps 

We analyzed the gap data combined with invariant characters and nucleotide sequence data using 
PAUP* (for the MP criterion) and GARLI v0.96p (for the ML criterion). GARLI v0.96p is capable of 
analyzing partitioned data. The gap partition was analyzed using the CFNv+F model, as described 
above, whereas the nucleotide data were analyzed using the general time reversible (GTR) model with 
F-distributed rates and invariant sites (the GTR+I+F model). We estimated bootstrap support using 
600 replicates. 
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2.3. Evaluating the Results of Phylogenetic Analyses Using Gap Characters 
2.3.1. Evaluating the Gap Phylogeny Using Congruence 

The best empirical method to assess the performance of novel phylogenetic methods or sources of 
phylogenetic information is to examine congruence with a known phylogeny [53] or, if such a 
phylogeny is unavailable, with topologies generated using independent data [30,54]. Unfortunately, 
most "known phylogenies" used to assess phylogenetic methods provide relatively weak tests, because 
they tend to include relatively easy to recover clades (see also Hastad and Bjorklund [55]). There are a 
number of strongly supported relationships in the avian tree of Hfe {i.e., those with 100% bootstrap 
support in Figure 1). These relationships were generally well supported and broadly accepted by avian 
systematists prior to the Early Bird study [56]; many of these clades correspond to orders in the 
Clements checklist [57] and the IOC World Bird List [58]. As these strongly supported relationships 
represent weak tests of phylogenetic methods, we will focus on the difficult to recover supra-ordinal 
clades present in the Early Bird tree. 

Relationships among avian orders have proven to be very difficult to resolve and parts of the Early 
Bird tree may prove to be inaccurate. However, we note that a subset of the supra-ordinal clades 
present in the Early Bird tree have been corroborated to varying degrees by independent lines of 
evidence (Table 1). These independent lines of evidence include the results of analyses using mitochondrial 
genomes [28,59], transposable element (TE) insertions [60-62] and DNA hybridization [63] or 
phylogenetic analyses of nuclear gene regions not included in the Early Bird study [30,31,64]. Thus, 
these nodes represent difficult tests for phylogenetic methods, but they can nonetheless be viewed as 
"known" (with at least some degree of confidence) given their independent corroboration. 

To facilitate discussion of the clades supported by the Early Bird tree, we have combined the 
classification used by Clements checklist [57] with a set of names for supra-ordinal clades (Table 1). 
The Clements classification was altered in two ways: non-mo nophyletic orders were split (in 
most cases, families were elevated to ordinal rank) and a broader circumscription (consistent with 
Wetmore [65] and the IOC World Bird List [58]) of Piciformes was used. In addition to facilitating the 
discussion of groups in this manuscript, we believe that the circumscriptions of ordinal and supra-ordinal 
clades that we present will be usefiil for two reasons: almost all orders are strongly supported by the 
bootstrap in the Early Bird tree, and the supra-ordinal clades can be mapped onto the commonly used 
checklists [57,58] in a straightforward manner. 

The supra-ordinal clades listed in Table 1 are defined as the least inclusive clade comprising the 
relevant species in the Early Bird tree (see supporting information, file 1). Although the International 
Code of Zoological Nomenclature does not regulate names above the family level we have adhered to 
priority for several groups as much as possible (see references in Table 1). The name "Australavis", 
the spelling published by Ericson [66], was modified to have the more appropriate ending "-es". 
Priority for Strisores [67] is also somewhat problematic, so an alternative name (Cypselomorphae) 
proposed for a less inclusive clade, but sometimes used as a synonym, is also included in Table 1 (see 
supporting information, file 2, for additional details regarding the nomenclature of this group), but we 
retain that terminology. We have also proposed names for as yet unnamed clades; etymology for those 
names is provided in the supporting information (file 2). 
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Figure 1. Estimate of avian phylogeny based upon nucleotide sequence data (maximum 
likelihood [ML] tree using the GTR+I+F model) and the higher-level classification 
described in the text. Nodes with 100% support are indicated with an asterisk. Red 
asterisks indicate nodes with 100% support that define supra-ordinal clades with extensive 
independent corroboration (see below). Coloring conventions here will be used in all trees, 
and named supra-ordinal clades are indicated using letters below branches (see Table 1 
for details). 
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Figure 1. Cont. 
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Six strongly supported supra-ordinal clades were omitted from Table 1 (indicated with red asterisks 
in Figure 1). Four of these clades correspond to the major divisions in the avian tree of life: Palaeognathae 
(Struthioniformes and Notopalaeognathae), Galloanserae (or Galloanseres [68]; Galliformes and 
Anseriformes), Neoaves (all other extant birds) and Neognathae (Galloanserae and Neoaves). These clades 
have received extensive independent corroboration (reviewed by Cracraft et al. [69]). Daedalornithes 
(Aegotheliformes and Apodiformes [70]) and Mirandomithes (Podicipediformes and Phoenicopteriformes 
[71]) are also very strongly supported. These groups are typically recovered in phylogenetic trees 
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based upon single genes (examples of individual gene analyses that support some or all of these groups 
include those based upon RAGl [72-74], EGRl [73] and up to 18 additional genes [13,75]). Thus, 
although these groups correspond to supra-ordinal clades, they do not represent difficult tests for 
phylogenetic methods. 

Table 1. Supra-ordinal clades in the Early Bird tree and their corroboration by independent 
evidence (from mitogenomics [28,59], analyses of nuclear regions [30,31,63,64] that were 
not used by Hackett et aL [13] and transposable element (TE) insertions [60-62]). Strong 
corroboration (bootstrap support >70% or >3 TE insertions) was indicated using and 
moderate corroboration (presence of the clade in with bootstrap support <70% or 1-2 TE 
insertions) was indicated using "+". Blank cells indicate that the available independent 
evidence could not address the presence or absence of the clade, whereas " — " indicates 
evidence contradicting the clade. Citations for the introduction of clade names are 
included; names without citations were introduced here (supporting information, file 2). 
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2.3.2. Estimating the Rate at Which Gap Character Changes Accumulate 

The rate of gap character change was estimated using ML estimates of branch lengths in the Early 
Bird [13] tree. Since branch lengths are expressed as substitutions per site (including invariant sites), 
estimates of branch lengths for gap characters include the added invariant characters. Thus, we 
multiplied the branch lengths based upon gap characters by the size of the gap character matrix 
(including the added invariant characters) and then divided by the size of the nucleotide matrix. This 
allowed the indel rate to be expressed as gap character changes per nucleotide site, making it directly 
comparable to nucleotide rates. 

2.3.3. Evaluating the Information Content of Gap Characters 

The phylogenetic information content of gap characters relative to nucleotide data was evaluated 
using the ML bootstrap support of each node for trees estimated using each data type, but restricted to 
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contain the same number of parsimony informative characters. When datasets differed in size, 100 
jackknife pseudomatrices were generated; these reduced the number of parsimony informative sites in 
the larger dataset to that of the smaller. Each of these 100 jackknifed pseudomatrices was then 
bootstrapped, and the average bootstrap support values were used. We compared four pairs of data 
matrices: (1) all gap characters (4,245 informative characters) compared to gap characters based upon 
indels >1 bp in length (3,160 informative characters); (2) all gap characters compared to gap characters 
based upon indels >2 bp in length (2,640 informative characters); (3) all gap characters compared to 
nucleotide substitution characters; and (4) all gap characters compared to RY-coded nucleotide 
substitution characters (making the nucleotide data binary, like the gap characters). 

3. Results and Discussion 

3.1. The Power of Gap Characters to Resolve the Avian Tree of Life 

Resolving the topology deep in the avian tree of life is a notoriously difficult problem [12,76,77], 
making it an excellent test case for novel sources of phylogenetic information. The power of specific 
types of data to resolve phylogenetic relationships depends upon the size of the matrix, rate of evolution, 
amount of homoplasy and branch lengths in the true tree. The ideal evolutionary rate for phylogenetic 
characters is rapid enough for a high probability of synapomorphic changes to occur on the shortest 
branches in the tree, but not so high that homoplastic changes obscure historical signal [36,37]. 
The nucleotide substitution rate for introns appears to be appropriate for analyses of deep avian 
phylogeny [12]. In contrast, gap characters accumulate at a much lower rate (the MP treelength given 
gap data are approximately 10% of the treelength given nucleotide data). The ML estimate of the gap 
accumulation rate is even lower (Figure 2), although the lower homoplasy of gap characters may prove 
advantageous if very large gap datasets were analyzed. 

Figure 2. Branch lengths estimated from gap data (using the CFNv+F model) plotted 
against branch lengths from all nucleotide data (estimated using the GTR+I+F model). 
Branch length estimates for specific nucleotide partitions (introns, coding exons and 3' 
untranslated regions [UTRs]) are presented for comparison of relative rates (next page). 
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To examine the phylogenetic signal in gap characters, we obtained estimates of the avian tree of Hfe 
based only upon gap characters (Figure 3 and supporting information, files 3 and 4). The gap tree had 
relatively high bootstrap support for most orders (Figure 3), the structure within orders (supporting 
information, files 3 and 4) and the small number of strongly supported supra-ordinal clades (i.e., the 
clades indicated with red asterisks in Figure 1), albeit often with lower bootstrap support than the 
nucleotide tree. Those supra-ordinal groups recovered in the gap trees (e.g., Novaeratitae, Picocoraciae, 
Picodynastomithes and Strisores) were much more poorly supported by the bootstrap in the gap 
character tree than they were in the nucleotide tree. Other independently corroborated supra-ordinal 
clades were not even present in the gap tree (e.g., Telluraves). However, there was also an interesting 
exception; McCormack et aL [64] found a strongly supported Eurypygiformes-Phaethontiformes clade. 
This clade is present in the gap trees. We have refi*ained fi*om suggesting a name for this clade, since it 
is absent fi-om the Early Bird tree and lacks independent corroboration, but it could be a case where 
analyses of gap characters exhibit better agreement with other sources of information than the analyses 
nucleotides conducted by Hackett et al. [13]. Overall, these analyses demonstrated that a large gap 
character matrix has sufficient phylogenetic signal to recover many of the most strongly corroborated 
nodes in the avian tree of hfe, but few of the most difficult nodes. 

Substantial branch length heterogeneity was evident in both the nucleotide and gap trees, and 
branch lengths appear to be somewhat correlated between the two data types (Figure 4). Several taxa 
have long branches relative to their close relatives, including Turnix (Charadriiformes), Tinamiformes 
(Paleognathae) and Phasianidae (represented here by the genera Coturnix, Gallus and RoUulus within 
the order Galliformes), in both the nucleotide and gap trees (Figure 4). This indicates that rates of 
nucleotide substitution and the accumulation of gap characters are correlated in birds, as expected 
based upon analyses of other groups of organisms (e.g., Hardison et aL [78]). 

This branch length heterogeneity may influence the estimate of topology, and it is tempting to 
speculate that the clustering of the long-branched Psittacopasserae and Picocoraciae within Telluraves 
reflects long branch attraction, especially given the short branches associated with the raptorial taxa 
(Accipitriformes, Cariamiformes, Falconiformes and Strigiformes) within this supra-ordinal clade. If 
so, the gap tree would actually provide a less accurate estimate of avian phylogeny than the nucleotide 
tree given that both Psittacopasserae and Australaves are paraphyletic in the ML gap tree but strongly 
supported by independent evidence [30,60]. 

The observed branch length heterogeneity suggests that ML methods might provide better estimates 
of avian phylogeny than MP, because parsimony equivalent models {i.e., the "no common mechanism" 
[NCM] model [79]) are unlikely to account effectively for branch length heterogeneity [80,81]. Indeed, 
it is clear that standard model selection approaches will indicate that the CFNv+F model has a better fit 
to the data than NCM [80], although we do note that there is debate regarding the question of whether 
MP should be viewed as a model [82]. Despite this prediction, our results are equivocal regarding the 
relative performance of these methods (e.g., compare Figure 3A to 3B). Indeed, the MP tree supports 
monophyly of Psittacopasserae and Austrodyptomithes (Figure 3B), unlike the ML tree, albeit with 
low (<50%) bootstrap support in both cases. The best interpretation of these differences between the 
MP and ML topologies is unclear, although differences between the trees at the supra-ordinal level 
provide no clear evidence that ML using the CFNv+F model performs substantially better than MP. 
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Figure 3. Estimates of avian phylogeny obtained using 12,030 gap characters obtained using 
(a) ML analyses with the CFNv+F model and (b) the maximum parsimony (MP) criterion. 
Orders were collapsed when monophyletic to simpHfy the trees. Bootstrap support on 
terminal branches reflects the support of those orders; orders represented by a single taxon 
are indicated using "(l)"- There were a limited number of rearrangements relative to the 
nucleotide topology within orders, most without bootstrap support. We highlighted the 
topology for the order GalHformes, because the gap topology included a clade with bootstrap 
support that conflicts with multiple nuclear gene regions [8,83] and morphology [84]. 
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Figure 3. Cant. 
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Figure 4. Branch length heterogeneity evident in the (a) optimal nucleotide tree (based 
upon the GTR+I+F model) and (b) the optimal gap tree (based upon the CFNv+F model). 




Examining other aspects of model fit, including conducting ML analyses without correcting for 
acquisition bias {i.e., using the CFN+F model), also resulted in similar topologies. These equivocal 
results are most likely to reflect the limited phylogenetic information in gap data matrices, even ones 
as large as that analyzed here. This suggests that it will be necessary to examine even larger data 
matrices to determine whether either analytical approach provides an adequate fit to the underlying 
process of indel evolution and to estabhsh the impact of these methods upon topology. 

3.2. Phylogenetic Signal in Gap Characters Based upon Indels of Different Lengths 

Above, we described two reasons why gap trees might have lower bootstrap support than the 
nucleotide tree. Specifically, the limited bootstrap support we observed could reflect the low rate of 
accumulation for gap character changes or poor model fit (alternatively, it could reflect a combination 
of both). Another possibility is that the gap data are sufficiently noisy that neither ML nor MP can 
recover an accurate estimate of the true tree. Even if noise is not positively misleading, it can have a 
negative impact upon the phylogenetic analyses [85]. Thus, noise reduction methods might provide a 
useful complement to model improvement. Short indels, especially 1 -bp indels, are more common than 
long indels in avian non-coding regions [10,86], suggesting that gap characters based upon short indels 
may contain more noise than those based upon long indels. Thus, the removal of short indels has the 
potential to enhance phylogeny reconstruction. 
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To examine the utility of noise reduction based on gap length, we filtered the full gap data matrix 
(12,030 characters of which 4,245 were parsimony informative) and excluded gap characters based on 
short (1- and 2-bp) indels. Removing 1-bp gaps reduced the matrix size by almost 25% (to 9,115 
characters; 3,160 parsimony informative), whereas excluding both 1- and 2-bp gap characters reduced 
the matrix size by an additional 11% relative to the original matrix size (to 7,740 characters; 2,640 
parsimony informative). Although the rate of longer gap accumulation was lower (the rate after 
excluding 1-bp gaps is 76% of that for the all gap matrix and the rate after excluding 1- and 2-bp gaps 
is 64%) all three data matrices exhibit similar levels of homoplasy (Table 2). Estimates of phylogeny 
obtained after removing short indels did not improve congruence with the nucleotide data tree 
(supplementary information, file 2). Robinson-Foulds distances [87] between the nucleotide trees and 
all of the gap trees ranged from 92 to 100, whereas the distance among gap trees ranged from 64 to 70. 
Removing short gaps may prove beneficial for other data sets, but these results showed that 1- and 
2-bp gaps did not contribute substantially to the noise in the gap dataset. 

Table 2. Retention indices [88] for gap characters and nucleotide data. Retention indices 
were calculated using the ML topologies for nucleotides (Figure 1) or gaps (Figure 3a). 



Topology 



Data Matrix 


Nucleotide tree 


Gap tree 


Gaps 






All 


0.7154 


0.7209 


>l-bp (excluding 1-bp gaps) 


0.7141 


0.7190 


>2-bp (excluding 1- and 2-bp gaps) 


0.7238 


0.7288 


Nucleotides 






All 


0.5231 


0.5188 


Introns 


0.5206 


0.5167 


Coding exons 


0.5315 


0.5251 


3' untranslated regions 


0.5632 


0.5597 



Not surprisingly, given the similar level of homoplasy in the full gap-data matrix and the filtered 
matrix with 1-bp gaps removed, bootstrap support in analyses using identical numbers of informative 
characters was similar in trees made from both data sets (Figure 5a). In fact, only four nodes exhibited 
fairly large changes in bootstrap support when 1-bp gaps were removed. In three cases, this was an 
improvement (from <50% to >70%); in the fourth case it was a decrease (from 69% to 12%). The node 
with reduced support united Picodynastornithes, a clade with independent corroboration (Table 1). In 
fact, Picodynastornithes was not present in the ML tree for gap data excluding 1-bp gaps; instead, the 
ML tree included a conflicting clade that comprised Coraciiformes and Bucerotiformes (supporting 
information, files 3 and 4). Similar results were obtained when both 1-bp and 2-bp gaps were excluded. 

Surprisingly, the rearrangement within Picocoraciae observed when long gaps were excluded unites 
the "traditional" Coraciiformes. Morphological support for traditional Coraciiformes is mixed; traditional 
Coraciiformes form a clade in the analyses of Livezey and Zusi [89], whereas the analyses of Clarke 
et al. [90] conflict. However, we found it provocative that the gap trees support a Momotus-Todus 
clade, a topology that agrees with some morphological analyses [90,91] and conflicts with analyses of 
nucleotide data (Figure 1). However, none of the analyses of gap data had bootstrap support >70% for 
the Momotus-Todus clade. 
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Figure 5. (a) Comparison of bootstrap support in trees based on all gap characters and gap 
characters >l-bp in length. Bipartitions that appeared well supported (>70% bootstrap) by one 
analysis and poorly supported (<50% bootstrap) in the other are shaded. Numbers correspond 
to the following bipartitions: 1. Ardea-Cochlearius-Eudocimus\ 2. Alisterus-Psittacula\ 
3. Chalcopsitta-Platycercus\ and 4. Picodynastomites. (b) Comparison of bootstrap support for 
analyses using all gap characters and RY-coded nucleotide data. The same numbers of 
informative characters were used in each of these analyses (next page). 
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In contrast to the modest differences between analyses using different gap data matrices, much 
larger differences were observed when we compared bootstrap support from the nucleotide and gap 
trees (Figure 5b). This observation does not reflect differences in state space because the nucleotide 
data were RY-coded to address the more Hmited character state space in the binary gap characters. 
These results suggest the existence of both congruent and incongruent signals in the gap and nucleotide 
data and indicate that the incongruent signals in the gap data were not disproportionately associated 
with gaps based upon the shortest indels. 

3.3. Combined Analyses of Nucleotide Substitutions and Gap Characters 

ML analysis of the combined nucleotide and gap character data (including invariant gap characters) 
resulted in an estimate of phylogeny (Figure 6) virtually identical to the nucleotide tree (Figure 1). In 
general, there was a modest increase in the average bootstrap support for groups in the partitioned ML 
analyses of nucleotide substitutions and gap characters (Figure 6). However, there were also five nodes 
that exhibited more substantial increases in bootstrap support (>10%); four corresponded to 
supra-ordinal clades (Figure 6) and the fifth to the Balaeniceps-Scopus clade in Pelecaniformes (which 
increased to 75%). This general increase in support is consistent with the general assumption that 
including indel information in phylogenetic analyses would prove useful. 
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Figure 6. Combined evidence estimate of the avian tree of life. A partitioned ML analysis 
was conducted using the GTR+I+F model for the nucleotide partition and the CFNv+F 
model for the gap partition. Arrows indicate nodes defining supra-ordinal clades where 
bootstrap support increased or decreased by more than 10% relative to the nucleotide 
analysis (Figure 1). The combined evidence topology for Columbiformes was congruent 
with the gap topology instead of the nucleotide topology (inset; bootstrap values are 
reported for combined analysis [above branches] and for gap characters [below branches]). 
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There were also four nodes in the combined evidence tree that exhibited fairly large (>10%) 
decreases in bootstrap support. These decreases were evident for three supra-ordinal groups (Figure 6) 
and the Dendrocolaptes-Scytolopus clade in Passeriformes (which decreased to 66%). There was 
another difference between the nucleotide and combined evidence trees within Columbiformes. The 
combined evidence topology for this order corresponded to that in the gap tree, where the relevant 
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branches had even higher bootstrap support (Figure 6). Although the majority of differences between 
the nucleotide tree and the gap tree are likely to reflect the more limited power of gap characters to 
resolve phylogeny, these differences are likely to indicate the existence of conflicting phylogenetic 
signals in nucleotide substitutions and gap characters. These conflicts are likely to highlight nodes in 
the Early Bird tree [13] that should receive additional scrutiny. 

There were two nodes with high bootstrap support in both the nucleotide (Figure 1) and gap trees 
(Figure 3) that conflicted; in both cases, the total evidence tree (Figure 6) was consistent with the 
nucleotide tree. Surprisingly, given the lower homoplasy of gap characters relative to nucleotide data 
(Table 2), independent evidence suggested that the nucleotide tree was more likely in both cases: 

1. The nucleotide tree supports the monophyly of Notopalaeognathae in contrast to both the MP 
and ML gap trees (Figure 3), although only the latter had high bootstrap support. The 
nucleotide topology is strongly supported by independent evidence, including reanalyses of 
complete mitochondrial genomes [29], analyses of independent nuclear data matrices [31], TE 
insertions [62] and analyses of morphological data. 

2. The nucleotide tree supports a clade comprising New World quail (Colinus) and Phasianidae 
within Galliformes (Figure 1), whereas the gap tree supports a clade comprising Guineafowl 
(Numida) and Phasianidae (Figure 3B). The former topology is supported by analyses of 
multiple nuclear and mitochondrial sequences [8,83], TE insertions [92] and morphology [84]. 

The combined evidence tree was virtually identical to the nucleotide tree, probably reflecting the 
ability of rapidly accumulating nucleotide changes to overwhelm the analysis. Nonetheless, the signal 
in gap characters appears to have an influence, because several supra-ordinal clades exhibited 
increases or decreases in bootstrap support >10% relative to the nucleotide tree. Support for the 
unnamed clade uniting Novaeratitae and Tinamiformes increased substantially. Although the existence 
of this clade is supported by analyses of complete mitochondrial genomes [29] analyses of independent 
nuclear data [31] were equivocal and two TE insertions [62] conflicted with the clade (there were no 
TE insertions consistent with the combined analysis). Likewise, support for Insolitaves also increased, 
although there is no independent evidence supporting this clade (Table 1). Finally, support for 
Accipitriformes, one the few orders with limited bootstrap support, also increased (from 58% to 71%). 
In contrast, two supra-ordinal clades with independent corroboration (Australaves and Eucavitaves) 
exhibited decreased support. Neither of those two clades appeared in the gap tree (Figure 3). The 
decreased support for Australaves and Eucavitaves in the combined evidence topology is consistent 
with the hypothesis that the nucleotide and gap data exhibit some genuine (albeit limited) conflict. 

3.4. Analyses of Gap Characters and Models oflndel Evolution 

The conflicts between the gap and nucleotide data may reflect the poor fit of the models we used for 
analysis. Better models of indel evolution are clearly desirable, because the actual patterns of indel 
evolution are no doubt more complex than the combination of gap coding and analyses using the 
CFNv+F model or parsimony-equivalent models. Indeed, it is unlikely that any of the models used in 
phylogenetics have a perfect fit to the underlying processes of sequence evolution. Nonetheless, 
approximating models have proven very usefiil for phylogenetic estimation (see Sullivan et al. [93] 
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and Huelsenbeck et al. [81] for additional discussion). Thus, we felt that the simple ML approach we 
used represented a reasonable starting point that should be tested. However, we did not find this simple 
ML method performed substantially better than analyses using the MP criterion, suggesting future 
studies should explore more complex models. 

Models of sequence evolution have improved along with our understanding of the processes of 
sequence evolution [81]. This raises the question of which aspects of indel evolution might prove to be 
most important for improving models of indel evolution. Although short indels are more common than 
long indels [10,86], we found that filtering the data matrix to remove short indels did not improve 
congruence, raising questions about the value of incorporating this correlation into models of indel 
evolution. The existence of a deletions bias has been established both for birds [17,86,94] and 
mammals [11], and incorporating this asymmetry might be usefiil. Indeed, asymmetry should be 
intrinsic to models of indel evolution; sequence ahgnments that represent evolutionary history 
accurately can include homoplastic deletions, but homoplastic insertions should be forbidden (since 
distinct insertion are, by definition, not homologous; e.g., Alekseyenko et al. [95]). Although more 
complex and realistic models that combine sequence and indel evolution in this manner have been 
proposed [95], it is unclear they can be implemented in a way that will prove to computationally 
tractable for phylogenies of this size. It also remains unclear whether these more complex models 
capture all of the relevant features of indel evolution, but the observation that analyses excluding indel 
information can be positively misleading [14,22] suggests that development of improved models of 
indel evolution remains critical. 

Another aspect of model fit that should not be ignored is the assumption that a single tree underUes 
the observed distribution of gaps. Gene trees can differ fi*om the species tree for several reasons [96]; 
for avian phylogeny, the most common reason is probably deep coalescence. The short branches at the 
base of Neoaves (Figure 4) suggest that incomplete lineage sorting due to deep coalescence was 
common during the radiation of this group [97]. The distribution of TE insertions is consistent with 
incomplete lineage sorting [60,61]. Discordance among gene trees is known to lead to the incorrect 
estimation of species trees when concatenated analyses are conducted [98], and we expect 
concatenated analyses of gaps fi^om multiple loci to inherit all of the properties of similar analyses that 
use nucleotide data. Although nucleotide and gap data refiect the same genes and, therefore, the same 
set of gene trees, the number of gap characters and variable nucleotides differs among loci. These 
differences in the number of characters in each partition effectively result in differential weighting of 
loci in the gap and nucleotide trees and, therefore, create the potential for analyses of nucleotides and 
gaps to recover different topologies. 

4. Conclusions 

Our analyses indicated that a gap data matrix of more than 12,000 characters was unable to resolve 
the majority of difficult relationships in the avian tree of life (and, thus, were not clearly superior to 
nucleotide data), although the data did appear to improve bootstrap support when combined with 
nucleotide data. As expected, gaps accumulated much more slowly than nucleotide substitutions and 
this low rate likely limited their power for phylogenetic reconstruction. Rates of gap accumulation also 
differed among taxa in a manner correlated with the rate of nucleotide substitution. The observation 
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that rates of gap accumulation differed among taxa suggested that model-based analyses {i.e., ML with 
the CFNv+r model) might improve phylogenetic analyses of indels. However, we found only modest 
differences in performance between MP and ML. Additionally, removing short and potentially more 
homoplasious gaps did not improve tree reconstruction. Since the rate of gap character change is 
approximately an order of magnitude slower than the nucleotide substitution rate, it seems likely that at 
least an order of magnitude more data will be necessary to provide sufficient information to resolve the 
avian tree of life using indels alone. These larger indel datasets are likely to be available from birds 
very soon, and they should have the potential to contribute to the development of better models of 
indel evolution, improving fixture studies that include gap characters in many groups of organisms. 

Additional Note 

While this manuscript was being reviewed Joel Cracraft generously provided a preprint describing 
the taxonomy used in the forthcoming Howard and Moore Complete Checklist of the Birds of the 
World [99]. I have included a brief description of the differences between the clade names used in that 
taxonomy and those used here in the supporting information (file 2). 
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